library(scDown)

1 Define Universal Variables for scVelo Part II

Set up the environment for running scVelo Part II by defining key variables.

An .h5ad object containing spliced and unspliced RNA counts is required as input. The variable annotation_column names the metadata column that holds the cell type (or other annotation) labels, so that the analysis groups cells correctly.

The working directory output_dir defaults to the current directory and can be set to a specific path.

The mode for the scVelo velocity calculation can be 'stochastic' (default), 'deterministic', or 'dynamical' (slowest).

# Set the working directory
output_dir <- "/lab-share/RC-Data-Science-e2/Public/Qianyi/test_pipeline/scdown/scDown/tests"

# Input h5ad file path and name
h5ad_file <- "inst/extdata/DentateGyrus/10X43_1.h5ad"
# Metadata column of the h5ad object that contains the cell type annotation
annotation_column <- "clusters"

# Mode for the scVelo velocity calculation, default 'stochastic'
mode <- "stochastic"

# Number of top differential velocity genes to plot phase portraits for, default 5
top_gene <- 5
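
Before launching the run, it can help to check these settings. The following is a minimal sketch in base R. The helper check_scvelo_settings() is hypothetical, written only for this illustration, and is not part of scDown. It assumes that the three mode names listed above are the only accepted values.

```r
# Hypothetical helper (not part of scDown): collect problems with the settings.
# Returns a character vector of messages; an empty vector means no problem found.
check_scvelo_settings <- function(h5ad_file, output_dir, mode, top_gene) {
  # Assumption: only the three mode names listed in the text are accepted.
  allowed_modes <- c("stochastic", "deterministic", "dynamical")
  problems <- character(0)

  if (!file.exists(h5ad_file)) {
    problems <- c(problems, paste("h5ad file not found:", h5ad_file))
  }
  if (!dir.exists(output_dir)) {
    problems <- c(problems, paste("output directory not found:", output_dir))
  }
  if (!is.character(mode) || length(mode) != 1 || !(mode %in% allowed_modes)) {
    problems <- c(problems,
                  paste("mode must be one of:", paste(allowed_modes, collapse = ", ")))
  }
  if (!is.numeric(top_gene) || length(top_gene) != 1 || is.na(top_gene) ||
      top_gene < 1 || top_gene != round(top_gene)) {
    problems <- c(problems, "top_gene must be a single positive whole number")
  }
  problems
}
```

Calling check_scvelo_settings(h5ad_file, output_dir, mode, top_gene) with the variables defined above returns an empty character vector when the file and directory exist and the remaining settings are valid.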

2 Run scVelo Part II

This part performs RNA velocity calculations from an .h5ad file using the original scVelo Python package.

Workflow of run_scvelo_full():

1. Calculate RNA velocity using the scVelo workflow.
2. Identify cluster-specific differential velocity genes.
3. Infer trajectories using PAGA.

start_time <- proc.time()

# Run scVelo Part II
run_scvelo_full(h5ad_file = h5ad_file,
                output_dir = output_dir,
                annotation_column = annotation_column)

end_time <- proc.time()
elapsed_time <- end_time - start_time

# Print the elapsed time
print(elapsed_time)
# elapsed_time: 2 min
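
The difference of two proc.time() snapshots is a named vector in seconds. As a small sketch, the hypothetical helper elapsed_minutes() below (not part of scDown) extracts the wall-clock component and converts it to minutes. The Sys.sleep() call only stands in for the actual run.

```r
# Hypothetical helper (not part of scDown): wall-clock minutes between two
# proc.time() snapshots.
elapsed_minutes <- function(start_time, end_time) {
  elapsed <- end_time - start_time
  # "elapsed" is the wall-clock component, measured in seconds
  unname(elapsed[["elapsed"]]) / 60
}

start_time <- proc.time()
Sys.sleep(0.1)  # stand-in for the actual run
end_time <- proc.time()

cat(sprintf("Elapsed: %.2f min\n", elapsed_minutes(start_time, end_time)))
```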

2.1 Calculate RNA Velocity using scVelo

This step takes an AnnData object in .h5ad format and performs all basic velocity calculations enabled by scVelo. It also outputs basic figures, such as the proportions of spliced and unspliced counts and the RNA velocity vectors on the UMAP embedding.

This plot visualizes the percentage of spliced vs. unspliced RNA for each cell type.

This plot visualizes the velocity stream on the UMAP embedding.

This plot visualizes the velocity vectors on a grid over the UMAP embedding.

This plot visualizes the velocity arrows on the UMAP embedding.

2.2 Cluster-specific Differential Velocity Genes

This step performs a differential velocity t-test to find genes that explain the directionality of the calculated velocity vectors. It tests which genes have cell type-specific differential velocity expression, i.e., velocity that is significantly higher or lower than in the remaining population. It then visualizes phase portraits for the highly ranked genes, plotting unspliced mRNA abundance (y-axis) against spliced mRNA abundance (x-axis). Transcriptional induction of a gene results in an increase of (newly transcribed) precursor unspliced mRNAs.

This plot visualizes the phase portraits of the top 5 ranked differential velocity genes for each cell type.
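
To make the cluster-versus-rest comparison concrete, here is a toy sketch in base R on simulated values for a single gene. It uses Welch's t-test via t.test() and only illustrates the kind of comparison involved; it is not scVelo's implementation, and the cluster labels and velocity values are made up.

```r
set.seed(1)

# Simulated velocities of one gene (made-up numbers, for illustration only)
cluster  <- rep(c("Neuroblast", "Other"), times = c(50, 150))
velocity <- c(rnorm(50,  mean = 1, sd = 0.5),   # cluster of interest
              rnorm(150, mean = 0, sd = 0.5))   # remaining population

# Welch's t-test (the t.test() default): cluster of interest vs. all other cells
res <- t.test(velocity[cluster == "Neuroblast"],
              velocity[cluster != "Neuroblast"])

res$statistic  # positive when velocity is higher in the cluster of interest
res$p.value    # small when such a difference is unlikely under the null
```

Repeating a test like this for every gene and every cluster, then ranking genes by the test statistic, gives the general shape of a differential velocity ranking.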

# cluster: Astrocytes

# cluster: Cajal Retzius

# cluster: Cck-Tox

# cluster: Endothelial

# cluster: GABA

# cluster: Granule immature

# cluster: Granule mature

# cluster: Microglia

# cluster: Mossy

# cluster: Neuroblast

# cluster: nIPC

# cluster: OL

2.3 Trajectory Inference using PAGA

This step performs trajectory inference using partition-based graph abstraction (PAGA). It provides a graph-like map of the data, with solid edges corresponding to the transition confidence between two cell type groups (as defined in annotation_column). Here, PAGA is extended by velocity-inferred directionality and predicts transitions and lineages between groups.
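
As a toy illustration of how such a graph can be read, the sketch below turns a small matrix of transition confidences into a list of directed edges, keeping only those above a cutoff. The matrix values, the three group names chosen, and the 0.1 cutoff are all invented for this example; they are not output of the pipeline.

```r
groups <- c("nIPC", "Neuroblast", "Granule immature")

# Made-up transition confidences: rows are source groups, columns are targets
transition_conf <- matrix(c(0.0, 0.80, 0.0,
                            0.0, 0.00, 0.6,
                            0.0, 0.05, 0.0),
                          nrow = 3, byrow = TRUE,
                          dimnames = list(groups, groups))

threshold <- 0.1  # arbitrary cutoff for this example

# Row/column positions of all entries above the cutoff
idx <- which(transition_conf > threshold, arr.ind = TRUE)

edges <- data.frame(from = rownames(transition_conf)[idx[, 1]],
                    to = colnames(transition_conf)[idx[, 2]],
                    confidence = transition_conf[idx])

# Strongest edges first
edges <- edges[order(-edges$confidence), ]
print(edges, row.names = FALSE)
```

Each remaining row corresponds to one directed edge in the graph, pointing from the source group to the target group.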

This plot visualizes the directed graphs of predicted lineages.

3 Summary

This document outlines a pipeline for RNA velocity analysis that directly employs the original Python package scVelo. The pipeline starts from an .h5ad file with spliced and unspliced RNA matrices, which can be generated by run_scvelo(). It runs scVelo by fitting a stochastic model to each gene’s splicing dynamics and visualizes RNA velocity in stream plots, grid plots, and arrow plots. It also identifies differential velocity genes via statistical testing and visualizes top-ranked genes in phase portraits, highlighting those influencing state transitions.

In addition, it conducts partition-based graph abstraction (PAGA), which calculates transition probabilities and generates directed lineage graphs. These give a high-level overview of cell-state relationships and support lineage and trajectory inference.

This Python-based workflow offers a framework for analyzing transcriptional dynamics, providing insights into cellular trajectories and transitions.

This vignette was run on a machine with the following specifications: